 radiation therapy


Fluence Map Prediction with Deep Learning: A Transformer-based Approach

Mgboh, Ujunwa, Sultan, Rafi, Zhu, Dongxiao, Kim, Joshua

arXiv.org Artificial Intelligence

Accurate fluence map prediction is essential in intensity-modulated radiation therapy (IMRT) to maximize tumor coverage while minimizing dose to healthy tissues. Conventional optimization is time-consuming and dependent on planner expertise. This study presents a deep learning framework that accelerates fluence map generation while maintaining clinical quality. An end-to-end 3D Swin-UNETR network was trained to predict nine-beam fluence maps directly from volumetric CT images and anatomical contours using 99 prostate IMRT cases (79 for training and 20 for testing). The transformer-based model employs hierarchical self-attention to capture both local anatomical structures and long-range spatial dependencies. Predicted fluence maps were imported into the Eclipse Treatment Planning System for dose recalculation, and model performance was evaluated using beam-wise fluence correlation, spatial gamma analysis, and dose-volume histogram (DVH) metrics. The proposed model achieved an average R^2 of 0.95 +/- 0.02, MAE of 0.035 +/- 0.008, and gamma passing rate of 85 +/- 10 percent (3 percent / 3 mm) on the test set, with no significant differences observed in DVH parameters between predicted and clinical plans. The Swin-UNETR framework enables fully automated, inverse-free fluence map prediction directly from anatomical inputs, enhancing spatial coherence, accuracy, and efficiency while offering a scalable and consistent solution for automated IMRT plan generation.
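As a rough illustration of the two fluence-level metrics quoted above, the sketch below computes MAE and R^2 between one predicted and one reference fluence map. It assumes R^2 means the coefficient of determination computed over all pixels of a beam; the abstract does not say which definition the authors used. The function names and the toy 2x2 maps are hypothetical.

```python
def _flatten(fluence_map):
    """Flatten a 2D map (list of rows) into a single list of pixel values."""
    return [value for row in fluence_map for value in row]


def mae(pred, ref):
    """Mean absolute error over all pixels of two equally sized maps."""
    p, r = _flatten(pred), _flatten(ref)
    if len(p) != len(r):
        raise ValueError("maps must contain the same number of pixels")
    return sum(abs(a - b) for a, b in zip(p, r)) / len(r)


def r_squared(pred, ref):
    """Coefficient of determination of pred against ref (an assumed definition)."""
    p, r = _flatten(pred), _flatten(ref)
    if len(p) != len(r):
        raise ValueError("maps must contain the same number of pixels")
    mean_ref = sum(r) / len(r)
    ss_res = sum((b - a) ** 2 for a, b in zip(p, r))  # residual sum of squares
    ss_tot = sum((b - mean_ref) ** 2 for b in r)      # total sum of squares
    return 1.0 - ss_res / ss_tot


# Toy maps, for illustration only (not data from the paper).
reference = [[0.0, 1.0], [2.0, 3.0]]
predicted = [[0.5, 1.0], [2.0, 3.0]]
print(mae(predicted, reference), r_squared(predicted, reference))
```

In practice these would be evaluated per beam and then averaged over the nine beams and the test cases, which is one plausible reading of the reported mean and standard deviation.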


Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis

Umeike, Robinson, Getty, Neil, Xia, Fangfang, Stevens, Rick

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated immense capabilities in understanding textual data and are increasingly being adopted to help researchers accelerate scientific discovery through knowledge extraction (information retrieval), knowledge distillation (summarizing key findings and methodologies into concise forms), and knowledge synthesis (aggregating information from multiple scientific sources to address complex queries, generate hypotheses, and formulate experimental plans). However, scientific data often exists in both visual and textual modalities. Vision language models (VLMs) address this by incorporating a pretrained vision backbone for processing images and a cross-modal projector that adapts image tokens into the LLM dimensional space, thereby providing richer multimodal comprehension. Nevertheless, off-the-shelf VLMs show limited capabilities in handling domain-specific data and are prone to hallucinations. We developed intelligent assistants finetuned from LLaVA models to enhance multimodal understanding in low-dose radiation therapy (LDRT), a benign approach used in the treatment of cancer-related illnesses. Using multilingual data from 42,673 articles, we devise complex reasoning and detailed description tasks for visual question answering (VQA) benchmarks. Our assistants, trained on 50,882 image-text pairs, demonstrate superior performance over base models as evaluated using an LLM-as-a-judge approach, particularly in reducing hallucination and improving domain-specific comprehension.


An Oversampling-enhanced Multi-class Imbalanced Classification Framework for Patient Health Status Prediction Using Patient-reported Outcomes

Yan, Yang, Chen, Zhong, Xu, Cai, Shen, Xinglei, Shiao, Jay, Einck, John, Chen, Ronald C, Gao, Hao

arXiv.org Artificial Intelligence

Patient-reported outcomes (PROs) directly collected from cancer patients being treated with radiation therapy play a vital role in assisting clinicians in counseling patients regarding likely toxicities. Precise prediction and evaluation of symptoms or health status associated with PROs are fundamental to enhancing decision-making and planning for the required services and support as patients transition into survivorship. However, the raw PRO data collected from hospitals exhibits some intrinsic challenges, such as incomplete item reports and imbalanced patient toxicities. To this end, in this study, we explore various machine learning techniques to predict patient outcomes related to health status, such as pain levels and sleep discomfort, using PRO datasets from a cancer photon/proton therapy center. Specifically, we deploy six advanced machine learning classifiers -- Random Forest (RF), XGBoost, Gradient Boosting (GB), Support Vector Machine (SVM), Multi-Layer Perceptron with Bagging (MLP-Bagging), and Logistic Regression (LR) -- to tackle a multi-class imbalanced classification problem across three prevalent cancer types: head and neck, prostate, and breast cancers. To address the class imbalance issue, we employ an oversampling strategy, adjusting the training set sample sizes through interpolations of in-class neighboring samples, thereby augmenting minority classes without deviating from the original skewed class distribution. Our experimental findings across multiple PRO datasets indicate that the RF and XGBoost methods achieve robust generalization performance, evidenced by weighted AUC and detailed confusion matrices, in categorizing outcomes as mild, intermediate, and severe post-radiation therapy. These results underscore the models' effectiveness and potential utility in clinical settings.
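The oversampling step described above (interpolating between neighboring samples of the same class) can be sketched in a SMOTE-like form. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the Euclidean distance, the neighbor count `k`, and the toy feature vectors are all hypothetical.

```python
import math
import random


def interpolate_minority(samples, n_new, k=3, seed=0):
    """Create n_new synthetic points from one class by interpolating
    between a randomly chosen sample and one of its k nearest in-class
    neighbors (a SMOTE-style sketch)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the remaining in-class samples by Euclidean distance to base.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbor = rng.choice(others[:k])
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority-class feature vectors, for illustration only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = interpolate_minority(minority, n_new=5)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real samples of the same class, it stays inside that class's observed feature range. How many points to generate per class, and therefore how far the class ratios shift, is a separate choice that the abstract does not specify.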


RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records

Park, Sangjoon, Wee, Chan Woo, Choi, Seo Hee, Kim, Kyung Hwan, Chang, Jee Suk, Yoon, Hong In, Lee, Ik Jae, Kim, Yong Bae, Cho, Jaeho, Keum, Ki Chang, Lee, Chang Geol, Byun, Hwa Kyung, Koom, Woong Sub

arXiv.org Artificial Intelligence

Research in context

Evidence before this study
We performed a comprehensive PubMed search for articles published in English up to August 1, 2024, using the search terms "radiotherapy" or "radiation therapy" in combination with "survival prediction" or "mortality prediction." This search yielded a total of 345 studies. The majority of these studies focused on survival prediction for specific cancer types, with relatively few addressing survival prediction following radiotherapy more broadly. Most of the identified studies employed statistical models requiring manually structured variables that are not easily extractable from electronic health records (EHRs). Only four studies utilized variables that could be easily extracted from EHRs for survival prediction, but these studies lacked critical information about disease status and overall patient condition, which are typically captured in unstructured EHR data. Instead, they relied on traditional, structured data such as blood test results or national registry information, or small datasets that were manually structured. Notably, no studies employed advanced flexible models, such as large language models (LLMs), to automate the structuring of unstructured data and incorporate it into survival prediction.

Added value of this study
Our findings suggest the potential of LLMs to process extensive unstructured data, which would be impractical for manual structuring. LLMs demonstrated high accuracy in structuring unstructured data, even without extensive tuning, using a single-shot example approach. Our study is the first to demonstrate that the appropriate application of LLMs can improve the prognosis of patients and the quality of healthcare delivery.

Implications of all the available evidence
The RT-Surv framework developed in this study has broad applications beyond radiation oncology. As unstructured clinical records form the basis of EHR data across all medical specialties, this framework can be adapted to reduce overall hospital mortality rates, predict length of stay, and assess complication risks. Its ability to automatically structure large volumes of unstructured data enables more accurate and efficient use of clinical data across various domains.


FedKBP: Federated dose prediction framework for knowledge-based planning in radiation therapy

Chen, Jingyun, King, Martin, Yuan, Yading

arXiv.org Artificial Intelligence

Dose prediction plays a key role in knowledge-based planning (KBP) by automatically generating patient-specific dose distributions. Recent advances in deep learning-based dose prediction methods necessitate collaboration among data contributors for improved performance. Federated learning (FL) has emerged as a solution, enabling medical centers to jointly train deep-learning models without compromising patient data privacy. We developed the FedKBP framework to evaluate the performance of centralized, federated, and individual (i.e., separated) training of a dose prediction model on the 340 plans from the OpenKBP dataset. To simulate FL and individual training, we divided the data into 8 training sites. To evaluate the effect of inter-site data variation on model training, we implemented two types of case distributions: 1) independent and identically distributed (IID), where the training and validation cases were evenly divided among the 8 sites, and 2) non-IID, where some sites have more cases than others. The results show that FL consistently outperforms individual training on both model optimization speed and out-of-sample testing scores, highlighting the advantage of FL over individual training. Under IID data division, FL shows comparable performance to centralized training, underscoring FL as a promising alternative to traditional pooled-data training. Under non-IID division, larger sites outperformed smaller sites by up to 19% on testing scores, confirming the need for collaboration among data owners to achieve better prediction accuracy. Meanwhile, non-IID FL showed reduced performance compared to IID FL, indicating the need for more sophisticated FL methods beyond mere model averaging to handle data variation among participating sites.
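The "model averaging" referred to above can be illustrated with a minimal federated-averaging step. This sketch assumes each site's model is a flat list of parameters and that sites are weighted by their number of training cases; the abstract does not state the weighting, so both choices, along with the function name and toy values, are assumptions.

```python
def federated_average(site_params, site_sizes):
    """Combine per-site parameter vectors into one global vector,
    weighting each site by its number of training cases."""
    if len(site_params) != len(site_sizes) or not site_params:
        raise ValueError("need one case count per site")
    n_params = len(site_params[0])
    if any(len(p) != n_params for p in site_params):
        raise ValueError("all sites must share the same parameter layout")
    total_cases = sum(site_sizes)
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes))
        / total_cases
        for i in range(n_params)
    ]


# Toy example: two sites, two parameters each (illustrative values only).
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]
```

With equal case counts, as in the IID split, this reduces to a plain mean of the site models. With unequal counts, as in the non-IID split, larger sites pull the global model toward their own parameters.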


A Large Language Model Pipeline for Breast Cancer Oncology

Pool, Tristen, Trujillo, Dennis

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated potential in the innovation of many disciplines. However, how they can best be developed for oncology remains underexplored. State-of-the-art OpenAI models were fine-tuned on a clinical dataset and clinical guidelines text corpus for two important cancer treatment factors, adjuvant radiation therapy and chemotherapy, using a novel Langchain prompt engineering pipeline. A high accuracy (0.85+) was achieved in the classification of adjuvant radiation therapy and chemotherapy for breast cancer patients. Furthermore, a confidence interval of 8.2% to 13.3% was formed from observational data on the quality of treatment from human oncologists, estimating the proportion of scenarios in which the model must outperform the original oncologist in its treatment prediction to be a better solution overall. Due to indeterminacy in the outcomes of cancer treatment decisions, future investigation, potentially a clinical trial, would be required to determine whether this threshold was met by the models. Nevertheless, with 85% of U.S. cancer patients receiving treatment at local community facilities, these kinds of models could play an important part in expanding access to quality care with outcomes that lie, at minimum, close to those of a human oncologist.


Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

Ding, Yuzhen, Holmes, Jason M., Feng, Hongying, Li, Baoxin, McGee, Lisa A., Rwigema, Jean-Claude M., Vora, Sujay A., Ma, Daniel J., Foote, Robert L., Patel, Samir H., Liu, Wei

arXiv.org Artificial Intelligence

In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D on-board imaging (OBI) is unavailable. However, tumor visibility is constrained by the projection of the patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment rooms with 3D-OBI such as cone beam CT (CBCT), the field of view (FOV) of CBCT is limited and the imaging dose is unnecessarily high, making it unfavorable for pediatric patients. A solution to this dilemma is to reconstruct 3D CT from kV images obtained at the treatment position. Here, we propose a dual-model framework built with hierarchical ViT blocks. Unlike a proof-of-concept approach, our framework takes kV images as the sole input and can synthesize accurate, full-size 3D CT in real time (within milliseconds). We demonstrate the feasibility of the proposed approach on 10 patients with head and neck (H&N) cancer using image quality (MAE: <45 HU), dosimetric accuracy (gamma passing rate (2%/2mm/10%) >97%), and patient position uncertainty (shift error: <0.4 mm). The proposed framework can generate accurate 3D CT faithfully mirroring the real-time patient position, thus significantly improving patient setup accuracy, keeping imaging dose to a minimum, and maintaining treatment veracity.


Understanding the PULSAR Effect in Combined Radiotherapy and Immunotherapy through Attention Mechanisms with a Transformer Model

Peng, Hao, Moore, Casey, Saha, Debabrata, Jiang, Steve, Timmerman, Robert

arXiv.org Artificial Intelligence

PULSAR (personalized, ultra-fractionated stereotactic adaptive radiotherapy) is the adaptation of stereotactic ablative radiotherapy towards personalized cancer management. For the first time, we applied a transformer-based attention mechanism to investigate the underlying interactions between combined PULSAR and PD-L1 blockade immunotherapy based on a murine cancer model (Lewis Lung Carcinoma, LLC). The proposed approach is able to predict the trend of tumor volume change semi-quantitatively, and excels in identifying the potential causal relationships through both self-attention and cross-attention scores.

Introduction
The field of combining radiotherapy and immunotherapy is rapidly evolving, and one aspect of particular interest is determination of optimal timing and sequence to harness the potential synergy between radiation therapy and immune checkpoint blockade.


An Overview of the Development of Stereotactic Body Radiation Therapy

Zong, Yanqi, Cui, Zhengrong, Lin, Luqi, Wang, Sihao, Chen, Yizhi

arXiv.org Artificial Intelligence

Stereotactic body radiation therapy (SBRT) refers to focusing high-energy rays in three-dimensional space on the tumor lesion area while reducing the dose received by surrounding normal tissues, which can effectively improve the local control rate of the tumor and reduce the probability of complications. With the comprehensive development of medical imaging, radiation biology, and other disciplines, this fewer-fraction, high-dose radiotherapy method has been increasingly developed and applied in clinical practice. The background, radiobiological basis, key technologies, and main equipment of SBRT are discussed, and its future development directions are outlined.


RadOnc-GPT: A Large Language Model for Radiation Oncology

Liu, Zhengliang, Wang, Peilong, Li, Yiwei, Holmes, Jason, Shu, Peng, Zhang, Lian, Liu, Chenbin, Liu, Ninghao, Zhu, Dajiang, Li, Xiang, Li, Quanzheng, Patel, Samir H., Sio, Terence T., Liu, Tianming, Liu, Wei

arXiv.org Artificial Intelligence

This paper presents RadOnc-GPT, a large language model specialized for radiation oncology through advanced tuning methods. RadOnc-GPT was finetuned on a large dataset of radiation oncology patient records from the Mayo Clinic in Arizona. The model employs instruction tuning on three key tasks - generating radiotherapy treatment regimens, determining optimal radiation modalities, and providing diagnostic descriptions/ICD codes based on patient diagnostic details. Evaluations conducted by comparing RadOnc-GPT outputs to general large language model outputs showed higher ROUGE scores in these three tasks. The study demonstrated the potential of using large language models fine-tuned using domain-specific knowledge like RadOnc-GPT to achieve transformational capabilities in highly specialized healthcare fields such as radiation oncology. However, our model's clinical relevance requires confirmation, and it specializes in only the aforementioned three specific tasks and lacks broader applicability. Furthermore, its evaluation through ROUGE scores might not reflect the true semantic and clinical accuracy - challenges we intend to address in future research.